Segmentation of images using the watershed method

ABSTRACT

A method of determining a unique number of colors for an image is described which uses homotopic transformation (in particular the watershed transformation) within feature space formed by red, green and blue components, or any other set of attributes, of the pixels of the image. Each color is given a label and the image is then segmented into regions of uniform labels. The color or characteristic of each segment can then be altered or used to identify an object. The method can be applied to any scalable, either integer or real, data set.

BACKGROUND OF THE INVENTION

In recent years, vast strides have been made in the field of computer-assisted image processing. The creation and manipulation of images has proved a boon to many engaged in the graphic arts field, industrial monitoring, and surveillance, but there are still problems in the initial stages of rendering an already existing image into processable form. The classic approach to securing a computerised image is to scan a photographic original to form a file in which data are stored representing properties of a large number of portions of the image, so-called pixels. Each pixel is characterised by a number of parameters corresponding to colour and intensity, and the file contains data relating to the location of each pixel so that when the file is called up by an appropriate program, the image is displayed on screen. Most recently, the process of scanning has been supplemented by the development of so-called digital cameras, which produce an image file directly.

In order to process the image to the form desired by the user, it often needs to be broken down into different parts, for example those corresponding to background and displayed object, in order to change the colour balance of the background without affecting that of other parts of the image. This process of segmentation is time-consuming and requires a high degree of skill. Attempts to automate the process have been made, but they do not work well or easily, as the intellectually comprehensible pieces of an image, clear to any human viewer, are simply not easily identifiable by a computer.

SUMMARY OF THE INVENTION

The present invention seeks to provide a method of analysing the data in an image file to yield information quite independent of human intervention. It seeks to enable patterns or structures within a data set, if such exist, to be revealed and used, both to describe the data and to make predictions if such patterns recur. The method does not depend upon superimposed assumptions based upon current theory and knowledge. It is implemented by the use of a computer system, and thus can enable that computer system to receive or gather data about the external world, either by importation of a picture of that world, or by direct input from a suitable camera system, and then to analyse such data without preconceptions. The system can thus arrive at an understanding of the world and, if desired, enable accurate predictions to be made about the world. Such a system can thus be seen as forming a basis for machine intelligence.

Several examples in which a digitised image is segmented on the basis of colour by a method according to the invention are set out in Examples 1, 2 and 3.

The invention essentially uses homotopic transformations, specifically the watershed transformation, within feature space formed by the colour components of the pixels of an image. These may be represented in many colour spaces, such as HLS (Hue, Lightness, Saturation), RGB (Red, Green, Blue), or CMYK (Cyan, Magenta, Yellow, Key). Images may be processed in their original colour space, or transformed into a different one before processing. Furthermore, additional channels of information may be generated algorithmically from the data, for example by performing an analysis relating to texture for each pixel, and included in the classification process. Additionally, rather than performing only one classification, in all the dimensions at once, there exists the option of performing several classifications, each in a subspace of the feature space, and of then making reference to some or all of the classifications in the segmentation process. This enables a unique number of colours, groups of colours, or contiguous regions in feature space, hereinafter called ‘classes’, to be found for that image. Each class is given a label and the image is then segmented into regions of uniform labels. The colour of each segment can then, for example, be altered or used to identify an object.
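By way of illustration only, the optional change of colour space may be sketched in Python using the standard-library colorsys module. The function name to_hls_levels, the 8-bit input range and the choice of 6 levels per component are assumptions made for this sketch; the description does not prescribe them.

```python
import colorsys


def to_hls_levels(r, g, b, levels=6):
    """Convert one 8-bit RGB pixel to HLS and quantise each component.

    `levels` (an assumed parameter) is the number of integer steps kept
    per component, i.e. the lattice resolution along each axis.
    """
    h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
    # Map each value in [0, 1] onto an integer in 0 .. levels - 1.
    return tuple(min(int(v * levels), levels - 1) for v in (h, l, s))


print(to_hls_levels(255, 0, 0))
```

Each resulting integer triple can serve as a node of a 3-dimensional lattice in place of the original RGB triple.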

The method can be applied to any scalable (either integer or real) data set. While the usual number of dimensions for the histogram is three, it is of course possible to use more, or fewer, if desired, but use of more than three dimensions materially increases the amount of computing power and computer memory required to carry out the necessary analysis. It should be noted that in using the method, the lattice resolution and connectivity will both affect the number of sets (as defined below) found.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates steps of a method of segmenting a digital image formed from an array of pixels according to the invention.

FIG. 2 illustrates the sub-steps of the assigning of the labels of the sets of parameter values of FIG. 1.

FIG. 3 illustrates a further sub-step to the method of FIG. 2.

FIG. 4 illustrates a method of determining the second label of FIG. 2.

FIG. 5 illustrates an additional step to the method of FIG. 1.

DESCRIPTION OF PREFERRED EMBODIMENTS

In order to understand the theory behind the method, it is necessary to bear in mind the following definitions:

Lattice: A set of nodes and connections between the nodes. A Euclidean lattice is a square grid pattern, with the nodes being the intersections, and the lines between them representing the connections. For a simple 2-dimensional Euclidean lattice, there are two possible ways of defining the connectivity. a) In the 4-connected lattice, nodes are adjacent only if they differ by one in exactly one dimension. This means that each node on the lattice has 4 neighbours, hence the name. b) In the 8-connected lattice, nodes are adjacent if they differ by one in any number of dimensions, including the case where the node differs by one in both dimensions. Thus the connectivity includes the diagonal nodes and there are in this case 8 nearest neighbours, hence the name. The nature of any lattice and its connectivity can be defined for any number of dimensions by extension of the 2-dimensional case.
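The two connectivities, and their extension to any number of dimensions, may be sketched as follows. This is an illustrative sketch only; the function name neighbours and the diagonal flag are assumptions introduced here.

```python
from itertools import product


def neighbours(node, diagonal=False):
    """Return the lattice nodes adjacent to `node` (a tuple of integers).

    diagonal=False: nodes differing by one in exactly one dimension
                    (4-connected in 2 dimensions).
    diagonal=True:  nodes differing by one in any number of dimensions
                    (8-connected in 2 dimensions).
    """
    result = []
    for offset in product((-1, 0, 1), repeat=len(node)):
        changed = sum(1 for d in offset if d != 0)
        if changed == 0:
            continue  # the node itself is not its own neighbour
        if not diagonal and changed != 1:
            continue  # skip diagonal moves
        result.append(tuple(c + d for c, d in zip(node, offset)))
    return result


print(len(neighbours((2, 2))), len(neighbours((2, 2), diagonal=True)))
```

In three dimensions the same function yields 6 and 26 neighbours respectively, which indicates how quickly the amount of adjacency checking grows with the number of dimensions.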

Geodesic distance: Any distance on a lattice must be measured along a continuous string of adjacent nodes. The string of adjacent nodes is known as a path. The geodesic distance between any 2 nodes is the length of the shortest path out of all possible paths between the nodes.
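A geodesic distance of this kind can be computed by a breadth-first search over the permitted nodes. The following is a minimal sketch assuming a 4-connected 2-dimensional lattice; the function name geodesic_distance and the sample node set NODES are hypothetical.

```python
from collections import deque


def geodesic_distance(nodes, start, goal):
    """Fewest steps from `start` to `goal` moving only between
    4-adjacent members of `nodes`; None if no path exists."""
    if start not in nodes or goal not in nodes:
        return None
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return dist[node]
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt in nodes and nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None


# An L-shaped string of nodes plus one isolated node (illustrative data).
NODES = {(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (0, 2)}
print(geodesic_distance(NODES, (0, 0), (2, 2)))
print(geodesic_distance(NODES, (0, 0), (0, 2)))
```

The second query returns None: although (0, 2) is only two units from (0, 0) in a straight line, no string of adjacent nodes joins them, so no geodesic distance is defined.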

Set: A collection of nodes, each of which is adjacent to at least one other node in the group. Thus, by moving only between adjacent nodes on the lattice, one can get from any member of the set to any other.

Jordan's Lemma: A process (such as adding a node to, or subtracting a node from, a set) preserves homotopy if nodes that were previously connected by a path remain connected, and those that were not connected by a path remain disconnected. Thus, if there exist n sets before the operation, there must only be n sets after the operation. Processes which satisfy the homotopy condition, if used exclusively, will guarantee that if there are n sets in a data set represented on a given lattice, then only n sets will be ‘discovered’ or revealed at the end of applying the processes.

Monotonic: A series of numbers in which each number is either as great (small) as or greater (smaller) than its predecessor.

Fall-set: This is a path in which the numerical value of the nodes varies monotonically, starting from a high value. If the path were followed from its high end to its low end, then by analogy with water flowing down a hillside this would describe how the water flowed. The nodes to which no water flows, and from which water flows, define a ‘watershed’. Hence, the algorithms that discover these nodes are known as ‘watershed’ algorithms.

The method of the invention is based on the idea of a fall-set, geodesic distance, and Jordan's Lemma. Jordan's Lemma gives the legitimate transformations of sets such that homotopy (number of sets) is preserved. Definition of a set: all those points that are connected continuously. Continuity is defined as being equivalent to adjacency on the lattice. Therefore, a set is all those points that are connected by a string of points that are members of the set. The Lemma states that a point can be added (or subtracted) provided that no point previously unconnected becomes connected (or any point previously connected becomes disconnected). A transformation that satisfies this restriction will conserve homotopy or keep the number of sets constant. By selecting as a seed point the highest point of the histogram, which must belong to a set, and then adding to it adjacent points that are lower and uniquely connected to that initial point (thus satisfying Jordan's Lemma), the fall set for that seed point will be discovered and defined. The identification by the algorithm of other seed points that are independent and uniquely definable ensures that if there are n sets (groups, classes) in the feature space only n will be found. By using the geodesic measure from the seed point for every point in the set, those points on the lattice that may be connected to several sets, and are therefore boundary points, can be assigned to the group they are nearest to in terms of geodesic distance from the seed point of that group.

The process is thus capable of discovering the number of homogeneous colours there are in any image. Once the number and precise definition of the groups is known then the original image can be segmented into homogeneous regions.

The method of the invention provides segmentation in a fashion which is qualitatively distinguished from previous approaches, which have used statistical decision theory. In statistical decision theory, the number and mathematical description of the sets within the data, or the distributions within a feature space, are assumed to be known. Further, it is assumed that the shape of each set is expressible as a continuous mathematical function. All these assumptions are totally erroneous. There is no rational procedure to calculate the number of colours in an image. The shape of the sets within the feature space (histogram) is never regular either in outline or in profile. The data are discrete and in no respect can such data be treated as continuous. At best, the traditional statistical decision theory approach is a very poor approximation to the actual distributions within feature space, and this inevitably leads to errors in assigning pixels to a set. In contrast, the method according to the invention describes the data accurately and therefore cannot be bettered, merely equalled in accuracy. It is the only method that can find an answer to the problem of ‘how many sets or colours are there in any one image?’. The answer for any given resolution (radiometric) and lattice connectivity is unique.

Instances of a data set, measured on at least interval scales, can be represented within a feature space, the axes of which are the dimensions of the data. Once the data is so represented, then the watershed algorithm can be used to describe precisely the number, size and shape of the independent sets (classes, groups) of the data. This is the basis of knowledge. Once the division of the feature space into its constituent sets (classes) has been completed, then it only remains to assign significance to each set and then to make predictions on the basis of a point belonging to a particular set. The watershed algorithm is thus a rigorous method of describing data for the purpose of prediction. It can also be used to segment (determine to which colour group a particular pixel belongs) any data set. It is unique in being able to describe any data set precisely in terms of the number, shape, and size of its components. No other method can ever do better than, only at best duplicate, the watershed solution.

The watershed algorithm used in the present invention can be applied to any data set that has the following characteristics:

1. The number of instances is large.
2. The component measures for each instance within the data set are all at least interval.

These are not severe restrictions and therefore the method can be applied to most data sets.

The invention is illustrated by way of the following example, which is, to conserve space and aid clarity of understanding, concerned with a data set microscopic compared with any real data set representing a pixellated image. However, it is believed that it serves to illustrate how the method is applied. The example describes the processing in simple terms, but in real implementation, all of the processes are carried out using appropriate computer programming.

EXAMPLE 1

In a first example, let us suppose we have a small image of 5 by 5 pixels, and that we have the red and green values for each pixel. This will define a 2-dimensional problem, and we would like to segment the image into its constituent ‘colours’, purely on the basis of the pattern of adjacent points within the feature space defined by the values of the two colours, red and green. Tables 1 and 2 show the components of an image, one for the red component, and the other for the green component.

TABLE 1 Red Component

4 3 4 4 5
5 4 1 2 1
5 3 0 1 1
1 1 0 1 2
0 1 1 1 1

TABLE 2 Green Component

1 1 1 0 1
2 1 5 4 3
1 0 4 5 3
5 4 5 3 4
5 5 4 4 4

The first step is to construct a hash table (the most economical method of storing sparse data) containing in the following order:

The hash table position, the red component, the green component, the number of pixels having these 2 components, the label for the class (set).

The hash function in this example is

L = Mod((R + 7*G) : 13)

where R is the red value, G the green value, and L is the remainder after dividing by 13 the result of R + 7*G. (The multiplier 7 and the modulus 13 are prime numbers: the hashing technique of storing data works best with prime numbers, and the choice of the prime numbers in the hashing function depends on the range of values to be stored.) Each pixel is taken in turn and the hash table is constructed with the entry for each combination being incremented each time that combination occurs, so that the final value will give the height of the histogram for that combination. As each pixel is entered into the hash table, the red and green components are checked, and if either differs the hash table key is incremented until an empty field is found, into which the colour is added. The result is shown in Hash Table 1:
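The construction just described may be sketched as follows, purely by way of illustration. The names hash_key and build_hash_table, the 0-based slot numbering and the wrap-around on probing are assumptions of this sketch; the slot in which a given colour lands depends on those choices and on the order of insertion, so the slot numbers need not coincide with the positions listed in the tables of this example. The histogram heights do not depend on them.

```python
RED = [[4, 3, 4, 4, 5], [5, 4, 1, 2, 1], [5, 3, 0, 1, 1],
       [1, 1, 0, 1, 2], [0, 1, 1, 1, 1]]
GREEN = [[1, 1, 1, 0, 1], [2, 1, 5, 4, 3], [1, 0, 4, 5, 3],
         [5, 4, 5, 3, 4], [5, 5, 4, 4, 4]]
TABLE_SIZE = 13


def hash_key(r, g):
    """Remainder after dividing R + 7*G by 13."""
    return (r + 7 * g) % TABLE_SIZE


def build_hash_table(red, green):
    """Return a list of slots; each used slot is [R, G, count, label]."""
    slots = [None] * TABLE_SIZE
    for red_row, green_row in zip(red, green):
        for r, g in zip(red_row, green_row):
            pos = hash_key(r, g)
            for _ in range(TABLE_SIZE):
                entry = slots[pos]
                if entry is None:            # empty field: store the colour
                    slots[pos] = [r, g, 1, None]
                    break
                if entry[0] == r and entry[1] == g:
                    entry[2] += 1            # same colour: raise the height
                    break
                pos = (pos + 1) % TABLE_SIZE  # different colour: next key
            else:
                raise RuntimeError("hash table is full")
    return slots


table = build_hash_table(RED, GREEN)
heights = {(e[0], e[1]): e[2] for e in table if e is not None}
print(len(heights), sum(heights.values()))
```

For the 5 by 5 image of Tables 1 and 2 this stores 12 distinct colour combinations whose heights sum to the 25 pixels of the image.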

HASH TABLE 1

Position  Red  Green  Count
 1
 2        0    4      1
 3        1    4      4
 4        2    4      2
 5        3    0      1
 6        4    0      1
 7        5    2      1
 8        3    1      1
 9        0    5      2
10        1    3      3
11        1    5      4
12        4    1      3
13        5    1      2

HASH TABLE 2

Position  Red  Green  Count
 3        1    4      4
11        1    5      4
10        1    3      3
12        4    1      3
 4        2    4      2
 9        0    5      2
13        5    1      2
 2        0    4      1
 5        3    0      1
 6        4    0      1
 7        5    2      1
 8        3    1      1
 1

HASH TABLE 3

Position  Red  Green  Count  Label
 1
 2        0    4      1      g
 3        1    4      4      g
 4        2    4      2      g
 5        3    0      1      r
 6        4    0      1      r
 7        5    2      1      r
 8        3    1      1      r
 9        0    5      2      g
10        1    3      3      g
11        1    5      4      g
12        4    1      3      r
13        5    1      2      r

The feature space will be considered as a Euclidean lattice that is four connected. In order to facilitate the procedure, the hash table is reordered using the fourth column (histogram height) such that the highest values occur at the beginning of the table. The result is as shown in Hash Table 2. Note that if two hash table entries differ by 1 in their red or green components (but not both) then they are considered to be adjacent. If a candidate entry is being considered as belonging to an already existing group, then provided it is adjacent to one entry of the group and not adjacent to any entry that belongs to another group, then it can be added to that group. This procedure conforms to Jordan's Lemma.
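The reordering and the adjacency test may be sketched as follows. This is an illustrative sketch only; the tuple layout (position, R, G, count) and the names sort_by_height and adjacent are assumptions introduced here.

```python
# Occupied rows of Hash Table 1 as (position, R, G, count).
HASH_TABLE_1 = [
    (2, 0, 4, 1), (3, 1, 4, 4), (4, 2, 4, 2), (5, 3, 0, 1),
    (6, 4, 0, 1), (7, 5, 2, 1), (8, 3, 1, 1), (9, 0, 5, 2),
    (10, 1, 3, 3), (11, 1, 5, 4), (12, 4, 1, 3), (13, 5, 1, 2),
]


def sort_by_height(entries):
    """Highest histogram values first; Python's sort is stable, so
    entries of equal height keep their original relative order."""
    return sorted(entries, key=lambda e: e[3], reverse=True)


def adjacent(a, b):
    """Four-connected test: R or G differs by 1, but not both."""
    return abs(a[1] - b[1]) + abs(a[2] - b[2]) == 1


ordered = sort_by_height(HASH_TABLE_1)
print([(e[1], e[2]) for e in ordered])
```

With a stable sort the colours come out in the same order as the rows of Hash Table 2.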

Starting at the beginning of the table, column 4, representing the histogram value for each colour, is scanned to find the maximum value. The maximum is found to be 4, and there are two such fields to consider. These differ in colour by 1 in one dimension only and are thus adjacent, and are given the same label g. Column 4 of the hash table is scanned for any field containing the histogram value 3. There are two. The first (position 10 in Hash Table 1) differs in colour from one of the 4s (position 3 in Hash Table 1) by 1 in only the green component and therefore can be assigned to the same group g. The second 3 (position 12 in Hash Table 1) differs by more than one from all the already examined entries and therefore this position is not adjacent to the g labelled positions, and is thus part of another set, and is given the label r. There being no more 3s to consider, column 4 of Hash Table 1 is next scanned for 2s. There are three, namely positions 4, 9, and 13. That in position 4 differs by one in one dimension from that in position 3 and is therefore assigned label g. Position 9's entry is adjacent to position 11's, and position 13's is adjacent to position 12's. None of these positions is adjacent to positions that belong to more than one group and therefore each can be added without violating Jordan's Lemma. Each is given the appropriate label. All the entries with histogram value 2 have been accounted for, so the 1s are located in column 4. There are five, in positions 2, 5, 6, 7, and 8 of Hash Table 1. Position 2's is adjacent to two, those at positions 3 and 9 of Hash Table 1. Both these have the same label, so that in position 2 can be assigned to that label. Position 5's entry is not adjacent to that of any already labelled position so is left unlabelled. Positions 6, 7, and 8 have entries which are adjacent to an already labelled position and are therefore assigned the same label. Position 5 is again examined, and its entry is now found to be adjacent to that of two positions, 8 and 6. Both these have the same label so the position acquires that label, r. The result can be seen in Hash Table 3.
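The walk-through above may be sketched in code as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name label_classes and the integer labels are introduced here, a new label is created only once no further entry at the current height can be attached to an existing group, and entries adjacent to two differently labelled groups are simply left unlabelled.

```python
# Histogram heights keyed by (R, G), listed in Hash Table 1 order.
HISTOGRAM = {
    (0, 4): 1, (1, 4): 4, (2, 4): 2, (3, 0): 1, (4, 0): 1, (5, 2): 1,
    (3, 1): 1, (0, 5): 2, (1, 3): 3, (1, 5): 4, (4, 1): 3, (5, 1): 2,
}


def neighbours4(node):
    r, g = node
    return [(r + 1, g), (r - 1, g), (r, g + 1), (r, g - 1)]


def label_classes(histogram):
    """Label feature-space nodes level by level, highest first."""
    labels = {}
    next_label = 0
    for height in sorted(set(histogram.values()), reverse=True):
        pending = [n for n, h in histogram.items() if h == height]
        while pending:
            progressed = False
            still_pending = []
            for node in pending:
                seen = {labels[m] for m in neighbours4(node) if m in labels}
                if len(seen) == 1:      # uniquely connected: join the group
                    labels[node] = seen.pop()
                    progressed = True
                else:                   # isolated or boundary: re-examine
                    still_pending.append(node)
            pending = still_pending
            if progressed:
                continue
            isolated = [n for n in pending
                        if not any(m in labels for m in neighbours4(n))]
            if not isolated:
                break                   # only boundary nodes remain
            seed = isolated[0]          # start a new group from this node
            labels[seed] = next_label
            next_label += 1
            pending.remove(seed)
    return labels


labels = label_classes(HISTOGRAM)
print(len(set(labels.values())))
```

Applied to the histogram of this example, the sketch finds two groups: label 0 corresponds to the entries marked g in Hash Table 3 and label 1 to those marked r.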

If a position is found to be adjacent to two (or more) positions which do not have the same label, it is a boundary node, and is given the label of the set to which it is closest, in the following sense. The geodesic distance to the first assigned position is computed for each of the sets and the smallest found. If the distances are equal the position is assigned to the first found group. The assignment of boundary points is not necessary to the procedure; it is a convenience that ensures a classification label is attached to every data combination. It is of course possible to leave these boundary combinations unlabelled and deal with them in some other manner.
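The boundary rule may be sketched as follows. The sketch assumes that the geodesic path to a group's first assigned position runs only through members of that group, which the description does not state explicitly; the names resolve_boundary and seeds, and the sample data, are likewise hypothetical.

```python
from collections import deque


def neighbours4(node):
    r, g = node
    return [(r + 1, g), (r - 1, g), (r, g + 1), (r, g - 1)]


def geodesic_distance(allowed, start, goal):
    """Breadth-first search restricted to the nodes in `allowed`."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return dist[node]
        for nxt in neighbours4(node):
            if nxt in allowed and nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None


def resolve_boundary(node, labels, seeds):
    """Pick the label whose first assigned position (seed) is nearest.

    `seeds` maps label -> seed node, in the order the groups were found,
    so that a tie goes to the first found group.
    """
    touching = {labels[m] for m in neighbours4(node) if m in labels}
    best_label, best_dist = None, None
    for label, seed in seeds.items():
        if label not in touching:
            continue
        members = {n for n, l in labels.items() if l == label} | {node}
        d = geodesic_distance(members, node, seed)
        if d is not None and (best_dist is None or d < best_dist):
            best_label, best_dist = label, d
    return best_label


# Illustrative data: two groups meeting at the boundary node (3, 0).
labels = {(0, 0): "a", (1, 0): "a", (2, 0): "a", (4, 0): "b", (5, 0): "b"}
seeds = {"a": (0, 0), "b": (5, 0)}
print(resolve_boundary((3, 0), labels, seeds))
```

Here the node (3, 0) lies three steps from the seed of group a and two steps from the seed of group b, so it is given label b.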

The appropriate label can now be assigned to each pixel by computing the hash table key for that pixel and reading the label from the hash table. The result is as follows:

TABLE 6 Final ‘Segmented’ Image

r r r r r
r r g g g
r r g g g
g g g g g
g g g g g
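The final look-up may be sketched as follows, for illustration only. A Python dictionary keyed by (R, G) stands in for the hash table, and the function name segment is an assumption of this sketch; the class labels are those of Hash Table 3.

```python
RED = [[4, 3, 4, 4, 5], [5, 4, 1, 2, 1], [5, 3, 0, 1, 1],
       [1, 1, 0, 1, 2], [0, 1, 1, 1, 1]]
GREEN = [[1, 1, 1, 0, 1], [2, 1, 5, 4, 3], [1, 0, 4, 5, 3],
         [5, 4, 5, 3, 4], [5, 5, 4, 4, 4]]

# Class label of each (R, G) combination, as in Hash Table 3.
CLASS_LABELS = {
    (0, 4): "g", (1, 4): "g", (2, 4): "g", (0, 5): "g", (1, 3): "g",
    (1, 5): "g", (3, 0): "r", (4, 0): "r", (5, 2): "r", (3, 1): "r",
    (4, 1): "r", (5, 1): "r",
}


def segment(red, green, class_labels):
    """Replace every pixel by the label of its colour combination."""
    return [[class_labels[(r, g)] for r, g in zip(red_row, green_row)]
            for red_row, green_row in zip(red, green)]


for row in segment(RED, GREEN, CLASS_LABELS):
    print(" ".join(row))
```

The printed rows reproduce the segmented image of Table 6.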

TABLE 7 Feature Space (Histogram)

(rows: red value; columns: green value)

     0  1  2  3  4  5
0    0  0  0  0  1  2
1    0  0  0  3  4  4
2    0  0  0  0  2  0
3    1  1  0  0  0  0
4    1  3  0  0  0  0
5    0  2  1  0  0  0

Table 7 is the histogram or feature space, as it would normally be represented. Using the fall set idea it is easy to see that there are just two sets in the data. When more dimensions are used the difference in time taken to scan the two tables (13 checks for the hash table, and 25 for the histogram in this case) becomes increasingly great, such that only a hash table method is feasible.

EXAMPLE 2

In a second example, the present invention provides a method using a computer for segmenting an image into a small number of homogeneous regions on the basis of the colour, the method comprising the steps of:

1. Digitising a source image to generate a digitised image file, comprising an n-dimensional map of m-tuples, each of which represents the colour value at that point in the image.
2. Forming an m-dimensional histogram of colour frequency in the digitised image file.
3. Sorting the entries in the histogram by height.
4. Choosing a point that has the highest histogram value attained by any unlabelled point.
5. Assigning a unique label to this point as follows:
    i. If no previously labelled point is adjacent, assign a new label and geodesic distance 0.
    ii. If there are adjacent elements with the same label, assign this label and determine the geodesic distance to be the same (if histogram value is the same) or one greater than (otherwise) the least value held by the neighbour having the highest histogram value.
    iii. If there are neighbours that have different labels, ignore this point at present.
6. Finding any other points having the same height and treating them as per 5.
7. Assigning to each remaining point at this level (as ignored at 5iii) (i.e., each point which is not uniquely connected) the label of whichever of its neighbours has the lowest geodesic distance.
8. Repeating steps 4 to 7 until all points within the feature space have been assigned a label.
9. Assigning to each pixel in the image the label that is found for that combination of component values within the feature space.
10. Giving each region of uniform labels a unique label to identify that segment.
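Step 10 may be sketched as a flood fill over the labelled image. The sketch assumes that pixels are 4-connected in the image plane, which the steps above do not specify; the name label_regions is hypothetical, and the input is the segmented image of Example 1.

```python
from collections import deque

LABEL_IMAGE = ["rrrrr", "rrggg", "rrggg", "ggggg", "ggggg"]


def label_regions(label_image):
    """Give each connected region of uniform class label its own number.

    Returns (region_map, region_count); pixels are treated as
    4-connected (an assumption of this sketch).
    """
    rows, cols = len(label_image), len(label_image[0])
    region = [[None] * cols for _ in range(rows)]
    count = 0
    for y in range(rows):
        for x in range(cols):
            if region[y][x] is not None:
                continue
            region[y][x] = count          # start a new segment here
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy + 1, cx), (cy - 1, cx),
                               (cy, cx + 1), (cy, cx - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and region[ny][nx] is None
                            and label_image[ny][nx] == label_image[cy][cx]):
                        region[ny][nx] = count
                        queue.append((ny, nx))
            count += 1
    return region, count


regions, n_regions = label_regions(LABEL_IMAGE)
print(n_regions)
```

Two pixels with the same class label but lying in separate parts of the image receive different segment numbers, which is what distinguishes step 10 from step 9.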

EXAMPLE 3

In a third example, the present invention provides a method for image segmentation which comprises the steps of:

1. Creating or digitising an image, consisting of an N-dimensional map of data, each element of which contains the M-dimensional colour data (the value at that point in the image of each of the M colour components).
2. Optionally transforming the image from its original colour space to another with the same or differing dimension (M), and/or adding further components (to increase M) generated algorithmically from the original data, e.g. ‘texture’, and/or scaling each component by the same or different amounts to alter the number of unique values in that band.
3. Making a histogram of frequency of each unique combination of component values (‘colour’), preferably by constructing a hash table, wherein elements of the histogram are ‘neighbours’/‘adjacent’ if their component values differ by no more than one in no more than a specified number of dimensions.
4. Optionally sorting the histogram's entries into buckets, ordered by histogram value, largest first.
5. For each histogram value found, starting from the highest and working down, consider the set of colours SC with that histogram value, performing steps 6 and 7.

1. A method for segmenting a digital image formed from an array of pixels, visual characteristics of each pixel being defined by a set of n parameters, each possible unique set of parameter values being representable as a node on an n-dimensional lattice, the method comprising steps of: storing frequency of each said unique set of parameter values occurring in a set of pixels forming the image; assigning a label to each said unique set of parameter values, taken in order of frequency; and determining an image segment comprising a set of pixels whose visual characteristics are defined by those sets of parameter values which have been assigned one or more specified labels; wherein said assigning of the label to each said unique set of parameter values comprises sub-steps of: assigning a first label to a set of parameter values if the node representing that set of parameter values is adjacent to at least one node representing a set of parameter values which has been assigned the first label, but which is not adjacent to any nodes representing sets of parameter values which have been assigned other labels; wherein sets of parameter values having a common frequency are reprocessed according to this sub-step until none of the nodes representing unlabelled sets of parameter values is adjacent to at least one node representing a set of parameter values which has been assigned the first label but which is not adjacent to any nodes representing sets of parameter values which have been assigned other labels; and assigning a new label to a set of parameter values if the node representing that set of parameter values is not adjacent to any node representing a labelled set of parameter values.

2. A method according to claim 1, wherein said assigning of the label to each said unique set of parameter values comprises a further sub-step of assigning a second label to a set of parameter values if the node representing that set of parameter values is adjacent to nodes representing sets of parameter values that have been assigned different labels.

3. A method according to claim 2, wherein the second label is one of the different labels.

4. A method according to claim 2, wherein the second label is determined according to steps of: (a) determining those sets of parameter values which were first to be assigned each label; (b) determining which of the nodes as determined in (a) is closest to the node representing the set of parameter values to which the second label is to be assigned; and (c) determining the second label to be the label assigned to the set of parameter values represented by the closest node as determined in (b).

5. A method according to claim 1, wherein the set of pixels whose visual characteristics are defined by those sets of parameter values which have been assigned one or more specified labels is a connected set of pixels.

6. A method according to claim 1, wherein the frequency of each said unique set of parameter values occurring in the set of pixels forming the image is stored in a hash table.

7. A method according to claim 1, wherein the set of parameters includes parameters defining color of a pixel.

8. A method according to claim 1, wherein the image is processed to modify parameters or to generate additional parameters.

9. A method according to claim 1, wherein the set of parameters includes parameters defining texture of the image at a pixel.

10. A method according to claim 1, wherein greater than three parameters are present.

11. A method according to claim 1, wherein the set of parameters is a sub-set of a total set of parameters associated with each pixel.

12. A method according to claim 1, wherein the image segment represents an object portion or background portion of the image.

13. A method according to claim 1 further comprising changing the visual characteristics of a portion of the image defined by the image segment independently of other portions of the image.

14. A method according to claim 1, wherein each node represents two or more sets of parameter values.

15. A method according to claim 1 further comprising reassigning labels to sets of parameters that have been assigned a third label when the number of sets of parameters assigned the third label is below a threshold.

16. A system constructed and arranged to carry out the method of any one of claims 1 to 15.